To:       Distribution

From: Robert S. Coren 

Date: Q\fllJ7b 

Subject:  Canonicalization of Terminal Input



In theory, terminal input to Multics is converted by the
ring-zero typewriter DIM to "canonical form", i.e., the physical
appearance of a line uniquely defines the form in which it will
be stored. In addition, well-defined meanings are attached to
input streams containing erase, kill, and escape characters.



In actual fact, the current typewriter DIM does not meet the
goals described in the preceding paragraph. The three basic types
of canonicalization (column assignment, erase/kill, and escape)
are each handled more or less correctly, but the current design
does not lend itself to correct and consistent processing of
combinations of canonicalization types. The trouble is that the
three types are handled more or less simultaneously. Thus the
final input resulting from strings such as "\027", "\Q\bU?" * 
"IQUQ"' "*025"(r etc., is not predictable under the current 
implementation. 



A redesigned, more efficient version of tty_read is planned
for Multics release 4.0; in the course of the new design,
canonicalization will be cleaned up and made consistent. The
details of this new design will be discussed in a future MTB; the
purpose of the present document is to set forth a complete
description of the rules of canonicalization that the new
tty_read will implement. It is proposed that the rules described
here be adopted as a standard for all situations in Multics where
canonicalization is required.



Multics Project internal working documentation. Not to be
reproduced or distributed outside the Multics Project.



CANONICALIZATION RULES



The three types of canonicalization named above must be
performed separately in a defined order, to ensure consistency
and predictability. In particular, the canonicalization process
is conceptually divided into the following steps:



1. If the terminal is in "can" mode, perform
column-assignment canonicalization on the typed input.

2. If the terminal is in "erkl" mode, perform erase/kill
canonicalization on the result of step 1.

3. If the terminal is in "esc" mode, perform escape
canonicalization on the result of step 2.



Of course, the actual implementation does not necessarily have to
perform the three steps in sequence, provided that the result is
the same as would have been achieved by doing so.
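
The ordering of the three steps can be sketched as follows. This
is an illustration in Python only, not the tty_read design: the
name canonicalize and its three phase arguments are hypothetical,
and each phase is assumed to be a function that takes a line and
returns a line.

```python
def canonicalize(line, modes, column_phase, erase_kill_phase, escape_phase):
    """Apply the enabled phases in the fixed order of steps 1-3.

    modes is a set of mode names ("can", "erkl", "esc"); the three
    phase arguments are caller-supplied functions (hypothetical here).
    """
    if "can" in modes:
        line = column_phase(line)        # step 1
    if "erkl" in modes:
        line = erase_kill_phase(line)    # step 2, on the result of step 1
    if "esc" in modes:
        line = escape_phase(line)        # step 3, on the result of step 2
    return line
```

A phase whose mode is off is simply skipped; the relative order
of the remaining phases does not change.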



The three types of canonicalization are discussed in more
detail below. If two or more of the rules listed below are
applicable to a given input string, they are applied in the order
in which they are presented here.



COLUMN ASSIGNMENT



This phase is concerned with determining which printing
graphics, if any, appear in each physical column position. This
is determined according to the following rules.



Rules for the interpretation of input characters



1. The leftmost position of the carriage is considered to be
column 1.

2. Each printing graphic or space typed increases the column
position by 1.

3. Each backspace typed decreases the column position by 1
unless the column position is 1.

4. A carriage return sets the column position to 1.

5. A horizontal tab increases the column position to the
next tab stop; tab stops are defined to be at columns 11,
21, 31, etc.

6. A newline, form feed, or vertical tab sets the column
position to 1 and advances the carriage vertically; thus
no character typed after such a character can share a
column position with a character typed before it.
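
Rules 1-6 amount to a small state machine over the column
position. The following Python sketch is one way to express them;
it is not taken from the typewriter DIM. The function name
assign_columns and the dictionary it returns are assumptions made
for illustration. Blank columns are not recorded in the result.

```python
BS, HT, CR = "\b", "\t", "\r"
LINE_ENDERS = "\n\f\v"          # newline, form feed, vertical tab


def assign_columns(typed):
    """Return (columns, ender) for one typed line.

    columns maps a column number to the list of printing graphics
    typed in that column; ender is the line-ending character, or ""
    if the input stops before one is typed.
    """
    col = 1                                        # rule 1
    columns = {}
    for ch in typed:
        if ch == BS:
            if col > 1:                            # rule 3
                col -= 1
        elif ch == CR:
            col = 1                                # rule 4
        elif ch == HT:
            col = ((col - 1) // 10 + 1) * 10 + 1   # rule 5: stops at 11, 21, 31, ...
        elif ch in LINE_ENDERS:
            return columns, ch                     # rule 6: the line is complete
        elif ch == " ":
            col += 1                               # rule 2: a space occupies a column
        else:
            columns.setdefault(col, []).append(ch) # rule 2: printing graphic
            col += 1
    return columns, ""
```

For instance, the keystrokes 2, backspace, underscore leave both
graphics in column 1, and a tab typed in column 3 moves the
carriage to column 11.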



Rules for the formation of the canonical string



7. Characters on each line are sorted so that their
associated column positions are monotone increasing.

8. No carriage return characters may appear in the canonical
string.

9. A horizontal tab is preserved as typed unless a printing
graphic appears in one of the columns skipped by the tab,
in which case the tab is replaced by an appropriate
number of spaces.

10. Backspaces appear in the canonical string only when two
or more printing graphics share a column position.

11. When two or more different printing graphics share a
column position, the characters are sorted as follows:
graphic with lowest numeric ASCII code, backspace,
graphic with next lowest numeric ASCII code, etc.

12. If the contents of a column position consist of two or
more instances of the same printing graphic, that column
is reduced to a single instance of the graphic.

13. A line-ending character (newline, form feed, or vertical
tab) immediately follows the last printing graphic in the
rightmost column position on the line.
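
A minimal Python sketch of rules 7, 8 and 10-13 follows. It
assumes the line has already been reduced to a mapping from
column number to the graphics typed there; the name
form_canonical and that mapping are hypothetical. As a stated
simplification, blank columns are always written as spaces, so
the tab preservation of rule 9 is not modeled.

```python
def form_canonical(columns, ender="\n"):
    """Build a canonical line from a {column: [graphics]} mapping."""
    out = []
    last = max(columns) if columns else 0     # rightmost occupied column
    for col in range(1, last + 1):            # rule 7: increasing column order
        # set() drops repeated graphics (rule 12); sorted() orders the
        # rest by character code (rule 11).
        graphics = sorted(set(columns.get(col, [])))
        # Backspaces appear only between graphics sharing a column (rule 10).
        out.append("\b".join(graphics) if graphics else " ")
    return "".join(out) + ender               # rule 13; no CR is emitted (rule 8)
```

Because the loop stops at the rightmost occupied column, trailing
white space never reaches the canonical string.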



ERASE AND KILL CHARACTERS 



The placement of erase/kill canonicalization after
column-assignment canonicalization and before escape
canonicalization is strategic in that it causes erase/kill
processing to work by column position rather than by character.
This eliminates ambiguity with respect to erase characters
combined with escape sequences. (See the examples at the end of
this document.)

The rules for erase and kill canonicalization are given below.



14. An erase character alone in a column position results in
the deletion of itself and of the contents of the
preceding column position.

15. An erase character alone in a column position and
preceded by more than one blank column results in the
deletion of all immediately preceding blank columns, as
well as of the erase character.

16. An erase character sharing a column position with one or
more printing graphics results in the deletion of the
contents of that column position.

17. A kill character results in the deletion of its own
column position and all column positions to its left,
unless it shares a column position with an erase
character, in which case rule 16 applies (the kill
character is erased).

18. If the terminal is in "esc" mode, an erase or kill
character alone in a column immediately preceded by an
escape character alone in a column is not processed as an
erase or kill character.



Note that for rule 18 to apply, the erase or kill character must
actually have been typed in the column immediately following the
escape character. The reason for this is that it facilitates the
erasing of escape sequences, e.g., \001####.
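
Rules 14-18 can be sketched in Python as a single left-to-right
pass. This is an illustration under stated assumptions, not the
tty_read code: a line is represented as a list with one string
per column (a blank column is " ", an overstruck column holds
several graphics), # is taken as the erase character, @ as the
kill character and \ as the escape character, and the name
erase_kill is hypothetical.

```python
ERASE, KILL, ESCAPE = "#", "@", "\\"


def erase_kill(cols, esc_mode=True):
    """Return the columns that survive erase/kill processing."""
    out = []
    for i, col in enumerate(cols):
        alone = len(col) == 1
        # Rule 18 looks at the column actually typed just before this one.
        escaped = esc_mode and alone and i > 0 and cols[i - 1] == ESCAPE
        if col in (ERASE, KILL) and escaped:
            out.append(col)                 # rule 18: kept as an ordinary graphic
        elif col == ERASE:
            if out and out[-1] == " ":
                while out and out[-1] == " ":
                    out.pop()               # rule 15: all preceding blank columns
            elif out:
                out.pop()                   # rule 14: the preceding column
        elif ERASE in col:
            pass                            # rule 16: only this column is deleted
        elif KILL in col:
            out.clear()                     # rule 17: this column and all to its left
        else:
            out.append(col)
    return out
```

With this representation, the columns of \001#### reduce to
nothing: the first # follows the 1 rather than the escape
character, so all four # characters act as erases.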



ESCAPE SEQUENCES 



The processing of escape sequences is performed according to 
the rules given below. 



19. An escape sequence consists of an escape character alone
in its column position followed by one or more printing
graphics each of which is alone in its column position.
An escape sequence is replaced by a single character in
the canonical string.

20. An escape sequence consisting of two successive escape
characters is replaced by an escape character.

21. An escape sequence consisting of an escape character
followed by an erase (or kill) character is replaced by
an erase (or kill) character.

22. An escape sequence consisting of an escape character
followed by one, two, or three octal digits is replaced
by the character whose ASCII value is represented by the
sequence of octal digits.

23. An escape character followed by a newline character
results in the deletion of both characters from the
canonical string.

24. Other escape sequences may be defined on a
per-terminal-type basis, where such a sequence consists
of an escape character and one character following.

25. If the character following an escape character does not
result in an escape sequence as defined by rules 20-24,
the escape and following characters are stored as they
appear on the line.
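
The escape rules can be sketched in Python in the same style.
The sketch assumes one string per column, with \ as the escape
character, # as erase and @ as kill; the name process_escapes and
the extra argument (a stand-in for the per-terminal-type
sequences of rule 24) are hypothetical. Overstruck columns hold
more than one graphic and therefore never match, which gives the
"alone in its column position" condition of rule 19.

```python
ESCAPE, ERASE, KILL = "\\", "#", "@"
OCTAL = set("01234567")


def process_escapes(cols, extra=None):
    """Return the columns with escape sequences replaced."""
    extra = extra or {}                   # rule 24 table (hypothetical)
    out = []
    i = 0
    while i < len(cols):
        col = cols[i]
        nxt = cols[i + 1] if i + 1 < len(cols) else None
        if col != ESCAPE or nxt is None:
            out.append(col)               # ordinary column, or escape at end of input
            i += 1
        elif nxt in (ESCAPE, ERASE, KILL):
            out.append(nxt)               # rules 20 and 21
            i += 2
        elif nxt in OCTAL:
            j = i + 1
            while j < len(cols) and j < i + 4 and cols[j] in OCTAL:
                j += 1                    # rule 22: at most three octal digits
            out.append(chr(int("".join(cols[i + 1:j]), 8)))
            i = j
        elif nxt == "\n":
            i += 2                        # rule 23: both characters are deleted
        elif nxt in extra:
            out.append(extra[nxt])        # rule 24
            i += 2
        else:
            out.append(col)               # rule 25: stored as it appears
            i += 1
    return out
```

Using a set for the octal digits makes the membership test exact,
so an overstruck column such as a 1 sharing a position with a 2
is not mistaken for a digit.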



In the examples below, the following conventions are used:

<NL>   represents a newline
<CR>   represents a carriage return
<BS>   represents a backspace
<HT>   represents a horizontal tab
<SP>   represents a space
{nnn}  represents a character whose ASCII value is
       nnn (octal)
\      is the escape character
#      is the erase character
@      is the kill character



The examples in the first group illustrate how various typed
sequences are canonicalized in terms of column position; these
are followed by examples of erase, kill, and escape
canonicalization. In the second group, lines are shown as they
appear physically, with no consideration given to the precise
sequence of keystrokes that might have produced them.



COLUMN CANONICALIZATION EXAMPLES

Example 1

Typed:      Nothing special about this line.<NL>

Appearance: Nothing special about this line.
Result:     Nothing special about this line.<NL>

Example 2

Typed:      Extraneous white s<SP><BS>pace is ignored.<CR><SP><NL>

Appearance: Extraneous white space is ignored.
Result:     Extraneous white space is ignored.<NL>

Example 3

Typed:      Two ways (2<BS>_) to overstrike.<CR>___<NL>

Appearance: Two ways (2) to overstrike.
            (with "Two" and the "2" underlined)

Result:     T<BS>__<BS>w_<BS>o ways (2<BS>_) to overstrike.<NL>



Example 4

Typed:      Tab + backspace is<HT><BS>reduced to spaces.<NL>

Appearance: Tab + backspace is    reduced to spaces.

Result:     Tab + backspace is<SP><SP><SP><SP>reduced to spaces.<NL>

(See rule 9.)

ERASE-KILL AND ESCAPE EXAMPLES

Example 5

Appearance: abz#cde
Result:     abcde

Example 6

Appearance: ab   #cde
Result:     abcde

Example 7

Appearance: Not@Never ob#n Sunday.
Result:     Never on Sunday.

Appearance: HQjLifM it's right. 
Result: Hq^ it's right. 

Appearance: HQJiiu it's right. 

Result: NfljjM it's right. 

(Erase character is overstruck; see Rule 16.)

Example 10

Appearance: dcl rrs char (1) static init ("\017#6");
Result:     dcl rrs char (1) static init ("{016}");

Example 11

Appearance: \023    (the 3 is overstruck)

Result:     {002}3  (the 3 remains overstruck)

(Overstruck 3 is not part of escape sequence.)

Example 12

Appearance: \112    (the \ is overstruck)
Result:     \112    (unchanged)
(Overstruck \ is not an escape character.)

Example 13

Appearance: a\##b

Result:     a\b

(First # is not an erase character by rule 18; second # erases
itself and preceding # by rule 14.)

Example 14 (similar to Example 13)

Appearance: a\3#b
Result:     a\b

Example 15

Appearance: a\b     (the \ is overstruck with #)

Result:     ab

(The \ is erased by the overstruck #.)

Example 16

Appearance: a\\#b
Result:     a\#b

(Erase canonicalization does not recognize the # by rule 18;
escape canonicalization recognizes \\ by rule 20, and attaches no
special meaning to the #.)



Example 17

Appearance: a\\##b

Result:     a\b

(By rule 18, the first # is not an erase character; by rule 14,
the second # erases itself and the preceding #; then rule 20
reduces \\ to \.)



Example 18

Appearance: a\\###b
Result:     a\b

(The first # is not an erase; the next two are, erasing the
second \ and the first #.)

Example 19

Appearance: a\\####b
Result:     ab

(The first # is not an erase, and must be erased before the two \
characters. Examples 16-19 illustrate the difficulty of erasing a
double \; the clearest method is probably to overstrike each \
with a #.)

Example 20 (on 2741-like terminal)

Appearance: a¢<#b

Result:     a\b

(Only the < is erased; ¢ is translated to \.)



